Differentially Private Random Decision Forests using Smooth Sensitivity

Authors

  • Sam Fletcher
  • Md Zahidul Islam
Abstract

We propose a new differentially-private decision forest algorithm that minimizes both the number of queries required and the sensitivity of those queries. To do so, we build an ensemble of random decision trees that avoids querying the private data except to find the majority class label in the leaf nodes. Rather than using a count query to return the class counts like the current state-of-the-art, we use the Exponential Mechanism to output only the class label itself. This drastically reduces the sensitivity of the query (often by several orders of magnitude), which in turn reduces the amount of noise that must be added to preserve privacy. Our improved sensitivity is achieved by using “smooth sensitivity”, which takes into account the specific data used in the query rather than assuming the worst-case scenario. We also extend work done on the optimal depth of random decision trees to handle continuous features, not just discrete features. This, along with several other improvements, allows us to create a differentially private decision forest with substantially higher predictive power than the current state-of-the-art.
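As a rough illustration of the leaf-labelling idea described in the abstract, the following minimal sketch selects a leaf's class label with the standard Exponential Mechanism. It is not the authors' algorithm: it uses the raw class count as the utility function and a fixed global sensitivity of 1, whereas the paper relies on smooth sensitivity, whose exact form is not given in the abstract. The function names `label_probabilities` and `exponential_mechanism_label` and all parameters are hypothetical.

```python
import math
import random


def label_probabilities(class_counts, epsilon, sensitivity=1.0):
    """Return {label: probability} under the standard Exponential Mechanism.

    Assumption: utility(label) = number of records with that label in the
    leaf, with global sensitivity 1 (adding or removing one record changes
    a count by at most 1). The paper's smooth-sensitivity variant is NOT
    implemented here.
    """
    if not class_counts:
        raise ValueError("class_counts must contain at least one label")
    # Subtract the maximum count before exponentiating for numerical
    # stability; this does not change the normalized probabilities.
    max_count = max(class_counts.values())
    weights = {
        label: math.exp(epsilon * (count - max_count) / (2.0 * sensitivity))
        for label, count in class_counts.items()
    }
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def exponential_mechanism_label(class_counts, epsilon, sensitivity=1.0, rng=None):
    """Sample one class label; only the label is released, never the counts."""
    rng = rng or random.Random()
    probs = label_probabilities(class_counts, epsilon, sensitivity)
    labels = list(probs)
    return rng.choices(labels, weights=[probs[l] for l in labels], k=1)[0]


if __name__ == "__main__":
    leaf_counts = {"yes": 40, "no": 12}  # hypothetical leaf contents
    print(exponential_mechanism_label(leaf_counts, epsilon=0.5))
```

With a larger gap between the most frequent and second most frequent class, or a larger privacy budget `epsilon`, the true majority label is returned with higher probability, which is the intuition behind releasing only the label rather than noisy counts.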


Similar resources

Differentially- and non-differentially-private random decision trees

We consider supervised learning with random decision trees, where the tree construction is completely random. The method was used as a heuristic working well in practice despite the simplicity of the setting, but with almost no theoretical guarantees. The goal of this paper is to shed new light on the entire paradigm. We provide strong theoretical guarantees regarding learning with random decis...


Differentially-Private Learning of Low Dimensional Manifolds

In this paper, we study the problem of differentially-private learning of low dimensional manifolds embedded in high dimensional spaces. The problems one faces in learning in high dimensional spaces are compounded in differentially-private learning. We achieve the dual goals of learning the manifold while maintaining the privacy of the dataset by constructing a differentially-private data struc...


Barriers to Black-Box Constructions of Traitor Tracing Systems

Reducibility between different cryptographic primitives is a fundamental problem in modern cryptography. As one of the primitives, traitor tracing systems help content distributors recover the identities of users that collaborated in the pirate construction by tracing pirate decryption boxes. We present the first negative result on designing efficient traitor tracing systems via black-box const...


Motivations for recreating on farmlands, private forests, and state or national parks.

This study explores the importance of different motivations to visit three types of recreational settings--farms, private forests, and state or national parks. Data were collected via a mail-back questionnaire administered to a stratified random sample of households in Missouri (USA). Descriptive and inferential statistics reveal both similarities and discontinuities in motivations for visiting...


Identifying predictive markers of chemosensitivity of breast cancer with random forests

Several gene signatures have been identified to build predictors of chemosensitivity for breast cancer. It is crucial to understand how each gene in a signature contributes to the prediction, i.e., to make the prediction model interpretable instead of using it as a black box. We utilized Random Forests (RFs) to build two interpretable predictors of pathologic complete response (pCR) based on tw...



Journal:
  • Expert Syst. Appl.

Volume 78


Publication date: 2017